Overlay modelling is a technique for describing a student's problem
solving skills in terms of a modular program designed to be an expert for the 
given domain. The model is an overlay on the expert program in that it
consists of a set of hypotheses regarding the student's familiarity with the 
skills employed by the expert. The modelling is performed by a set of P rules 
that are triggered by different sources of evidence, and whose effect is to 
modify these hypotheses. A P critic monitors these rules to detect 
discontinuities and inconsistencies in their predictions. 

An implementation of overlay modelling exists as a component of
WUSOR-II, a CAI program based on artificial intelligence techniques. WUSOR-II
coaches a student in the logical and probability skills required to play the
computer game WUMPUS. Preliminary evidence indicates that overlay modelling
significantly improves the appropriateness of the tutoring program's
explanations.

1. The MIT COACH Project

A traditional argument for computer aided instruction (CAI) has been that it is an
economic means for providing individualized instruction. The rapidly falling costs of
hardware make the economics of CAI progressively more appealing. But the extent to
which existing CAI has provided personalized instruction has been limited. This paper
develops a procedural theory of modelling that can be incorporated into CAI programs to
address this limitation.

This theory has been developed as part of the COACH Project at MIT, whose concern
is the development of AI-based CAI programs for tutoring the skills required for
successfully playing various computer games. The computer serves as an assistant to a
learner who is in the process of acquiring the skills necessary to play the game well.
Fig. 1 shows a generalized block diagram for these programs, with the modules given
anthropomorphic names to indicate their function. To distinguish them from their human
counterparts, references to the modules will be capitalized.

Good coaching is critically dependent on a detailed model of the learner in that
the model guides the coach in generating concise and appropriate explanations. This
paper discusses the theory of overlay modelling embodied in the Psychologist module,
the component of the Coach responsible for maintaining such models of a player's
current skills (the K model) and learning preferences (the L model). These models are
used by the Tutor module to prune complex explanations generated by the Expert. Just
as with a human speaker, the Coach abbreviates its statements by eliminating those
facts that are already known by the listener and those facts which are too complex.

A broad treatment of the potential role of computer coaches in education and the
issues raised by their design is given in [Goldstein 1977]. Detailed discussions of
preliminary implementations and experimental results are provided in [Stansfield, Carr
and Goldstein 1976] and [Carr 1977]. Seminal work on AI-based CAI is also described in
[Brown et al. 1975; Brown 1976; Collins & Grignetti 1975]. In particular, overlay
modelling is an extension of the issue-oriented approach to student modelling developed
by Burton and Brown [1975].

Overlay modelling is a technique for recognizing the constituent skills being
exercised by an individual in performing a problem solving task. The kernel idea is to
design a modular Expert program for the task, and to explain differences between the
behavior of the Expert and the subject in terms of the lack, on the player's part, of
some of the Expert's skills. Thus, a model of the player is a set of hypotheses, each
of which records the system's confidence that the player possesses a given skill. Such
models are called overlays to reflect the fact that the model of the individual is
basically a perturbation on the Expert's structure.

Defining overlays as subsets of the Expert's skills is a simplification of the
modelling problem in that it does not address situations in which the student
has an incorrect skill or an alternative skill. A discussion of this
limitation is given in section 5.

Modelling a learner is difficult. However, preliminary evidence with WUSOR-II
indicates that, at least for the restricted environment of a game and for the limited
purpose of guiding a tutor, adequate modelling can be obtained from a rule system
that accesses multiple sources of evidence, and a critic that detects inconsistencies
and discontinuities in the player's behavior. Fig. 2 is a block diagram of the
internal structure of the Psychologist.

Section 2 describes the Wumpus game, the experimental domain of the WUSOR-II
coach. The theory of overlay modelling is developed next (sections 3-4), followed by a
discussion of its limitations and extensions (sections 5-7), and concluding with our
experimental program and preliminary results (section 8). Related literature is
surveyed in section 9.

2. Wumpus, an Intellectual Game

The Wumpus game was invented by Gregory Yob [1975] and exercises basic knowledge
of logic, probability, decision analysis and geometry. Players ranging from children
to adults find it enjoyable. The game is a modern day version of Theseus and the
Minotaur. The player is initially placed somewhere in a randomly connected warren of
caves and told the neighbors of his current location. His goal is to locate the horrid
Wumpus and slay it with an arrow. Each move to a neighboring cave yields information
regarding that cave's neighbors. The difficulty in choosing a move arises from the
existence of dangers in the warren - bats, pits and the Wumpus itself. If the player
moves into the Wumpus' lair, he is eaten. If he walks into a pit, he falls to his
death. Bats pick the player up and randomly drop him elsewhere in the warren.

But the player can minimize risk and locate the Wumpus by making the proper
logical and probabilistic inferences from warnings he is given. These warnings are
provided whenever the player is in the vicinity of a danger. The Wumpus can be smelled
within one or two caves. The squeak of bats can be heard one cave away and the breeze
of a pit felt one cave away. The game is won by shooting an arrow into the Wumpus's
lair. If the player exhausts his set of five arrows without hitting the creature, the
game is lost. Fig. 3 illustrates a typical intermediate state a player might reach.

Skilled play exercises knowledge of logic, probability, decision theory and
geometry. The WUSOR-II Expert uses a rule-based representation of this knowledge,
consisting of approximately 20 rules, to infer the risk of visiting new caves.
However, for expository purposes, a simplified rule set consisting of five reasoning
skills is sufficient to illustrate overlay modelling.

FIGURE 3
AN INTERMEDIATE STATE IN A TYPICAL WUMPUS GAME
(Circled caves have been visited by the player. Sm = Wumpus warning, Sq = bat warning,
Br = pit warning.)


L1: (positive evidence rule) A warning in a cave implies that a danger exists
in a neighbor.

L2: (negative evidence rule) The absence of a warning implies that no danger
exists in any neighbor.

L3: (elimination rule) If a cave has a warning and all but one of its
neighbors are known to be safe, then the danger is in the remaining
neighbor.

P1: (equal likelihood rule) In the absence of other knowledge, all of the
neighbors of a cave with a warning are equally likely to contain a danger.

P2: (double evidence rule) Multiple warnings increase the likelihood that a
given cave contains a danger.
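
To make the five skills concrete, the following Python fragment is a minimal sketch of how L1, L2, L3 and P2 might be applied to a single kind of danger. It is not the WUSOR-II Expert: the function name, the data layout, and the assumption that every visited cave is itself safe are ours, and P1 is not modelled. The cave numbers are those of the pit example discussed later in this section.

```python
from collections import Counter


def analyse_warnings(neighbors, visited, warned):
    """Apply simplified versions of L1, L2, L3 and P2 for one kind of danger.

    neighbors: dict mapping a visited cave to the list of its neighbors
    visited:   set of caves the player has survived (assumed safe)
    warned:    subset of visited caves in which a warning was received
    """
    safe = set(visited)
    # L2 (negative evidence): no warning means every neighbor is safe.
    for cave in visited - warned:
        safe |= set(neighbors[cave])
    # L1 (positive evidence): a warning means some neighbor is dangerous.
    suspects = {cave: set(neighbors[cave]) - safe for cave in warned}
    # L3 (elimination): a single remaining suspect must hold the danger.
    located = {next(iter(s)) for s in suspects.values() if len(s) == 1}
    # P2 (double evidence): count how many warnings implicate each cave.
    evidence = Counter(c for s in suspects.values() for c in s)
    return suspects, located, evidence


# Drafts were felt in caves 4 and 16; both caves have been visited.
neighbors = {4: [16, 2, 14], 16: [0, 4, 14]}
suspects, located, evidence = analyse_warnings(
    neighbors, visited={4, 16}, warned={4, 16}
)
print(suspects)   # which unvisited caves each warning implicates
print(evidence)   # cave 14 is implicated by both warnings
```

Counting implicating warnings is only a stand-in for the probabilistic reasoning behind P2; it is enough to show why cave 14 stands out from caves 0 and 2.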

In terms of these skills, an overlay model for a player who has mastered the
simple logical rules (L1, L2), is in the process of acquiring L3, and has not yet
learned P2 is:



RULES   APPROPRIATE   USED   FREQUENCY   KNOWN

L1           5          6       100%       T
L2           4          3        75%       T
L3           4          2        50%       ?
P1           6          6       100%       T
P2           4          1        25%      NIL

               Overlay Model 1

The frequencies are determined by estimates made by the P rules of the number of times
a skill has been USED in proportion to the number of times it has been APPROPRIATE.
The KNOWN variable is set to T, ? or NIL by the P critic.
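
As a rough illustration, the bookkeeping behind such a table can be sketched in Python. The class and field names below are hypothetical and are not taken from WUSOR-II; the counts are the L2, L3 and P2 rows of Overlay Model 1, and encoding KNOWN as True, None or False is our own choice.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SkillRecord:
    """One row of an overlay model (hypothetical representation)."""
    appropriate: float            # times the Expert judged the skill appropriate
    used: float                   # times the player appeared to use the skill
    known: Optional[bool] = None  # True = T, False = NIL, None = "?"

    def frequency(self) -> float:
        """USED as a fraction of APPROPRIATE; 0.0 if never appropriate."""
        if self.appropriate == 0:
            return 0.0
        return self.used / self.appropriate


overlay_model_1 = {
    "L2": SkillRecord(appropriate=4, used=3, known=True),
    "L3": SkillRecord(appropriate=4, used=2, known=None),
    "P2": SkillRecord(appropriate=4, used=1, known=False),
}

for name, record in overlay_model_1.items():
    print(name, f"{record.frequency():.0%}", record.known)
```

Keeping the two counts rather than the percentage alone matters later, since the P critic works from the history of changes to USED and APPROPRIATE.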

The WUSOR-II Coach [Carr 77] maintains models of this kind for guiding its
explanations to the student. For example, consider fig. 4.

FIG. 4: SITUATION 1 - WHAT IS THE BEST MOVE?

Suppose the player moves to cave 14, the worst possible move. Given overlay model 1,
WUSOR-II would generate the following tutorial advice:

Ira, it isn't necessary to take such large risks with pits. One of caves 2
and 14 contains a pit. Likewise one of caves 0 and 14 contains a pit. This
is multiple evidence of a pit in cave 14 which makes it probable that cave 14
contains a pit. It is less likely that cave 0 contains a pit. Hence, Ira, we
might want to explore cave 0 instead.

Without the overlay model, the explanation would be longer and more complex as shown
below. The WUSOR-II Tutor has pruned the underlined text from the Expert's complete
analysis by noting that the student is already familiar with the positive and negative
evidence rules.

Ira, it isn't necessary to take such large risks with pits.

Cave 4 must be next to a pit because we felt a draft there. Hence, one of
caves 16, 2 and 14 contains a pit, but we have safely visited cave 16. This
means that one of caves 2 and 14 contains a pit.

Likewise cave 16 must be next to a pit because we felt a draft there.
Hence, one of caves 0, 4 and 14 contains a pit, but we have safely visited
cave 4. This means that one of caves 0 and 14 contains a pit.

This is multiple evidence of a pit in cave 14 which makes it probable that
cave 14 contains a pit. It is less likely that cave 0 contains a pit. Hence,
Ira, we might want to explore cave 0 instead.

Thus, the overlay model has allowed the tutor to focus on explaining the double 
evidence heuristic to the player. 
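
The pruning step can be pictured with a short Python sketch. This is an assumption-laden illustration, not the WUSOR-II Tutor: we suppose that each sentence of the Expert's analysis is tagged with the skill it explains, and the particular tags below are our own guesses.

```python
def prune_explanation(steps, known):
    """Keep only the steps that explain skills not yet marked as known ("T")."""
    return [text for skill, text in steps if known.get(skill) != "T"]


# Hypothetical tagging of part of the Expert's complete analysis.
steps = [
    ("L1", "Cave 4 must be next to a pit because we felt a draft there."),
    ("L2", "We have safely visited cave 16."),
    ("L3", "This means that one of caves 2 and 14 contains a pit."),
    ("P2", "This is multiple evidence of a pit in cave 14."),
]
known = {"L1": "T", "L2": "T", "L3": "?", "P1": "T", "P2": "NIL"}

for sentence in prune_explanation(steps, known):
    print(sentence)
```

Under overlay model 1 only the L3 and P2 sentences survive, which mirrors the shortened advice quoted above.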

3. The P Rules

No single source of evidence is a certain indicator of an individual's knowledge.
Hence, the Psychologist is provided with four sources of evidence: (1) implicit (the
student's behavior in playing the game), (2) structural (the intrinsic complexity
relations between skills of the Expert), (3) explicit (the dialog between tutor and
player), and (4) background (estimates of how average players of varying backgrounds
can be expected to perform).

In this section, we define the P rules, a set of procedures which modify the
overlay model when triggered by these various kinds of evidence. Section 4 describes
the P Critic whose function is to set the KNOWN variable on the basis of the history of
changes to USED and APPROPRIATE. In these sections, our example is the creation and
maintenance of the K model, an overlay on the Expert. [Goldstein 77] describes the
application of overlay techniques to the creation and maintenance of the L model, an
overlay on the Tutor.

Implicit Evidence: The student's play yields implicit evidence regarding his
mastery of various skills. The Expert evaluates the merits of the player's move 
relative to the available alternatives. The assumption is that the player has learned 
those skills involved in choosing his particular move and rejecting its inferiors, and 
has yet to learn those skills needed to recognize superior moves. 

The implicit evidence rules utilize the Expert's analysis as follows: 

P-I1: If skill S is involved in an overlooked superior move and not in the
current move, then increase APPROPRIATE by C(S) and recompute the
frequency.

P-I2: If skill S is involved in the current move and not a rejected inferior,
then increase USED and APPROPRIATE by C(S) and recompute the frequency.

where C(S) is a complexity factor ranging between 0 and 1 that
decreases as the skill becomes more complex relative to the student's
current knowledge state. C(S) is defined in the next section.

For example, in situation 1 the Expert reports to the Psychologist that caves 0
and 2 are better than 14 on the basis of double evidence (P2). If the player chooses
14, the Expert's analysis triggers P-I1 which increments APPROPRIATE but not USED.
(The FREQUENCY of P2 therefore drops.) On the other hand, if the player chooses cave 0
or 2, then P-I2 would be triggered and both USED and APPROPRIATE for P2 would increase.

The Expert also reports to the Psychologist that cave 0 is better than cave 2
because of the known bat in the latter. Hence if the player chooses 2, P-I1 is
triggered and the frequency of use of L3 drops, while choosing 0 has the opposite
effect.
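
The two implicit evidence rules can be sketched as a single update function. The Python below is a minimal sketch under our own assumptions: the Expert's analysis is reduced to three sets of skill names, the model is a plain dictionary, and the complexity factor C(S) is passed in as a function (fixed at 1.0 in the example).

```python
def apply_implicit_evidence(model, current, superior, inferior, weight):
    """Update USED and APPROPRIATE counts from one observed move.

    current:  skills involved in the move the player made
    superior: skills involved in overlooked superior moves
    inferior: skills involved in rejected inferior moves
    weight:   function giving the complexity factor C(S) for a skill
    """
    # P-I1: the skill was called for, but the player did not use it.
    for skill in superior - current:
        model[skill]["appropriate"] += weight(skill)
    # P-I2: the skill separates the chosen move from the rejected ones.
    for skill in current - inferior:
        model[skill]["used"] += weight(skill)
        model[skill]["appropriate"] += weight(skill)
    return model


def frequency(record):
    return record["used"] / record["appropriate"]


# The player moves to cave 14, overlooking the double evidence rule P2.
model = {"P2": {"used": 1.0, "appropriate": 4.0}}
apply_implicit_evidence(model, current=set(), superior={"P2"},
                        inferior=set(), weight=lambda skill: 1.0)
print(frequency(model["P2"]))  # lower than the 25% recorded in overlay model 1
```

Starting from the P2 row of overlay model 1 and taking C(S) = 1, the frequency falls from 1/4 to 1/5 after the move to cave 14.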



Structural Evidence: Clues to the student's knowledge arise from an analysis of
the intrinsic structure of the skills to be conveyed. This analysis of the Expert's
skills is stored as the Syllabus, a network linking the skills in terms of their
complexity and dependencies. Fig. 5 is a simplified Wumpus syllabus for the five
reasoning skills introduced earlier.



L1 (+ evidence) --> P1 (= likelihood) --> P2 (double evidence)

L2 (- evidence) --> L3 (elimination)

FIG. 5 - A SIMPLIFIED WUMPUS SYLLABUS



Structural knowledge suggests that given a student familiar with a certain region
of the syllabus (as indicated by the K model), it is more likely that a new skill being
acquired is at the frontier of this region rather than deep into unknown territory.
WUSOR-II implements this heuristic in a conservative fashion: C(S) is set to zero for
every skill more than one away from a known skill. WUSOR-II thus ignores the possible
employment of skills not at the frontier.

We currently believe that this is too conservative. It assumes that skills can
only be learned in the order in which they appear in the syllabus. Such an assumption
is too strong as the syllabus is only a guideline. Double evidence might be employed
despite non-mastery of the elimination strategy. Hence, our current plans call for
redefining C(S) to decrease in proportion to how far a skill is from the student's
current knowledge state,

C(S) = 1 / D(S)

where D(S) is the distance of S from the farthest known skill.

D(S) is the distance from the farthest, not nearest, known skill since the use of S
depends on all the skills linked to it. S may be linked to several skills, all but one
of which are known. The unknown skill then becomes the critical piece of knowledge.

For example, consider again a player moving to cave 0 in situation 1. The implicit
evidence rule P-I2 is triggered by the apparent use of double evidence P2. But with
respect to overlay model 1, the earlier skill L3 has not yet been learned. Hence, D(S)
= 2 and therefore the change to USED and APPROPRIATE is reduced by 50%. The
Psychologist is being cautious in interpreting this apparently advanced behavior as
evidence of a non-local improvement in the player's skill. However, this possibility
is not ignored.
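
One possible reading of the proposed C(S) is sketched below in Python. Several details are our assumptions rather than the paper's: the syllabus is stored as a table of prerequisite links, a link from L3 to P2 is added because the worked example treats L3 as an earlier skill for P2, skills without prerequisites are given distance 1, and D(S) is computed by walking back along every chain of prerequisites until a known skill is reached and taking the longest such walk.

```python
# Prerequisite links of the simplified syllabus (the L3 -> P2 link is assumed).
SYLLABUS = {
    "L1": [],
    "L2": [],
    "P1": ["L1"],
    "L3": ["L2"],
    "P2": ["P1", "L3"],
}


def distance(skill, known, prerequisites=SYLLABUS):
    """D(S): steps back from S to the farthest of its nearest known ancestors."""
    parents = prerequisites[skill]
    if not parents:
        return 1  # a root skill is treated as lying at the frontier
    return max(
        1 if parent in known else 1 + distance(parent, known, prerequisites)
        for parent in parents
    )


def complexity_factor(skill, known, prerequisites=SYLLABUS):
    """C(S) = 1 / D(S)."""
    return 1.0 / distance(skill, known, prerequisites)


known = {"L1", "L2", "P1"}              # skills marked T in overlay model 1
print(complexity_factor("P2", known))   # L3 is not yet known, so D(P2) = 2
print(complexity_factor("L3", known))   # L3 is at the frontier, so D(L3) = 1
```

With overlay model 1 this gives C(P2) = 1/2, matching the 50% reduction described above, while the frontier skill L3 keeps full weight.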

Explicit Evidence: Another source of evidence can be obtained from the player's
response to questions asked by the Tutor. This capability is not currently implemented
in WUSOR-II. We have plans to implement a facility for the Tutor to obtain explicit
evidence by asking the student two types of questions: test case and follow up
questions.

In a test case question, the tutor will ask the student to order the moves for the
current board state or a test case. Analyzing the response reduces to the Implicit
Evidence case, except that there is a larger window into the player's reasoning. The
Psychologist need not guess that the student has overlooked superior moves and rejected
inferior moves: the evidence is explicit in the requested ordering. The possibility
that the student has forgotten to consider one alternative (which might happen in a
complex game situation) is precluded. For example, situation 1 might serve as a test
case in conjunction with the following question:

Which of the following statements do you agree with most?

(1) Caves 0, 2 and 14 are equally safe.

(2) Caves 0 and 2 are equally safe, but cave 14 is more dangerous.

(3) Cave 0 is safer than both 2 and 14.

The second kind of explicit evidence will be derived from follow up questions that
ask the student to choose among a set of possible rationales for why the current move
was chosen. Rule P-E1 will monitor this source of evidence.

P-E1: If a player chooses the wrong rationale, then increment APPROPRIATE by 1
for each skill S involved in the correct rationale but absent in the
chosen rationale.

For example, a follow up question to a move to cave 0 in situation 1 might be:

Which of the following explanations apply?

(1) Caves 0, 2 and 14 are equally safe.

(2) Caves 0 and 2 are equally safe, but cave 14 is more dangerous because
there is double evidence for pits in 14 and only single evidence for
0 and 2. Otherwise 0 and 2 are the same.

(3) Cave 0 is safer than 2 because there is a bat in 2 but no bat in 0.
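
Rule P-E1 amounts to a set difference over the skills behind each rationale. The sketch below is hypothetical: the function name is ours, and so is the assignment of skills to rationales (double evidence P2 for explanation 2, elimination L3 for explanation 3), which the paper does not spell out.

```python
def apply_rationale_evidence(appropriate, correct_skills, chosen_skills):
    """P-E1: charge each skill the player's chosen rationale failed to mention."""
    for skill in correct_skills - chosen_skills:
        appropriate[skill] = appropriate.get(skill, 0) + 1
    return appropriate


# Assumed reading: the full rationale for cave 0 needs both P2 and L3,
# but the player selects only explanation (3).
appropriate = {"L3": 4, "P2": 4}
apply_rationale_evidence(appropriate,
                         correct_skills={"P2", "L3"},
                         chosen_skills={"L3"})
print(appropriate)
```

Only APPROPRIATE changes, so the frequency of the omitted skill falls, just as it does under P-I1.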

Background Evidence: Every teacher has expectations about the performance of a 
student on the basis of that student's background. This estimate changes as experience
with the student is acquired, but it provides a useful starting point. 

In the first implementation of the WUSOR coach, the Psychologist asked the player
to classify his level of skill as either "novice", "amateur", "advanced" or
"expert". Each of these skill levels corresponded to a different initialisation for
the overlay model.

We are currently experimenting with a set of background rules that associate
different starting states for the overlay model with different replies to a
questionnaire presented to the player at the beginning of his first game. These rules
are triggered by the player's age and experience with the game. For example, the three
rules for a secondary school player are:

P-C1: If the player is in secondary school with no previous experience, then
initialize the K model to AMATEUR, i.e. assume familiarity with the skills L1
(+ evidence) and P1 (= likelihood).

P-C2: If the player is in secondary school and has had 1-10 games experience
without coaching, then initialize the K model to ADVANCED, i.e. assume
familiarity with L1, L2, L3 and P1.

P-C3: If the player is in secondary school and has had over 10 games
experience without coaching, then initialize the K model to EXPERT, i.e.
assume familiarity with all of L1, L2, L3, P1 and P2.

Similar rules are used for pre- and post-secondary school players. The rules
associate naturally bounded portions of the syllabus (as determined by dependency and
complexity criteria) to various age and skill backgrounds. We do not yet have enough
experience with these background rules to know whether the categories of experience we
have chosen are reasonable. We plan to acquire this experience by studying whether the
implicit and explicit rules find a particular background skill estimate, on the
average, too high or too low for players of a given background.
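
The three secondary school rules reduce to a small lookup. The Python below is a sketch only: the function and table names are ours, the argument is assumed to count games played without coaching, and the rules for other age groups are not represented because the paper does not list them.

```python
# Skills assumed familiar at each starting level (from rules P-C1 to P-C3).
INITIAL_SKILLS = {
    "AMATEUR": {"L1", "P1"},
    "ADVANCED": {"L1", "L2", "L3", "P1"},
    "EXPERT": {"L1", "L2", "L3", "P1", "P2"},
}


def initial_level_secondary(uncoached_games: int) -> str:
    """Starting level for a secondary school player (hypothetical helper)."""
    if uncoached_games == 0:
        return "AMATEUR"       # P-C1: no previous experience
    if uncoached_games <= 10:
        return "ADVANCED"      # P-C2: 1-10 games without coaching
    return "EXPERT"            # P-C3: over 10 games without coaching


level = initial_level_secondary(3)
print(level, sorted(INITIAL_SKILLS[level]))
```

The starting state is only a first estimate; the implicit and explicit rules are expected to revise it as play proceeds.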

4. The P Critic 

The Psychologist maintains a history of changes to the USED and APPROPRIATE
variables in order to detect inconsistencies and discontinuities. Inconsistencies are
evidence that the P rules are failing to model the student properly, while
discontinuities are indications of a change in the player's knowledge state. The P
Critic makes these decisions.

Fig. 6 is a history graph for skill S. The graph is ideal in the sense that the
player consistently fails to use skill S in situations judged appropriate by the
Expert, until point X at which he thereafter consistently employs the skill. There is
no occasional use of the skill. The P Critic would set KNOWN to T shortly after point
X.

FIG. 6 - GRAPH FOR SKILL S

Real situations are not this clear cut; hence a certain tolerance is allowed as
shown in fig. 7. A slope of zero to 10 degrees results in KNOWN being set to NIL. A
slope of 35 to 45 degrees is sufficient for the critic to set KNOWN to T.
Between these two regions, KNOWN is set to ?.

"?" reflects uncertainty on the part of the Psychologist. The student may be in
the process of acquiring the skill, and not yet able to use it consistently. Or the P
rules may be failing to model the student properly.
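
The three regions can be expressed as a test on the slope of the history graph. The following Python is a minimal sketch under stated assumptions: the slope is measured over some recent window as the change in USED against the change in APPROPRIATE, and a window with no appropriate occasions is treated as uninformative. How WUSOR-II chooses the window is not described here.

```python
import math


def classify_known(delta_used, delta_appropriate):
    """Return "T", "?" or "NIL" from the slope of USED against APPROPRIATE."""
    if delta_appropriate <= 0:
        return "?"  # no appropriate occasions in the window: no evidence
    slope = math.degrees(math.atan2(delta_used, delta_appropriate))
    if slope <= 10:
        return "NIL"   # skill almost never used when appropriate
    if slope >= 35:
        return "T"     # skill used on nearly every appropriate occasion
    return "?"         # in between: acquisition in progress, or a modelling failure


print(classify_known(0, 5))   # flat history
print(classify_known(5, 5))   # 45 degree history
print(classify_known(2, 5))   # intermediate history
```

Since USED can never grow faster than APPROPRIATE, the slope stays between 0 and 45 degrees, which is why the T region ends at 45.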

When KNOWN = ?, the Tutor module of the Coach becomes cautious about assuming that
the student knows the skill even in situations where the student chooses the proper
move. Explicit evidence is sought by means of follow up questions inquiring about the
student's rationale. In the event that no clarification is obtained, i.e. the student
is inconsistent even on these questions, the Tutor will ultimately ignore the
Psychologist on this skill. The result is that the Coach is reduced to providing
explanations generated by the Expert (when the student makes a non-optimal move) that
are unpruned with respect to this skill.

FIG. 7 - P CRITIC REGIONS
(USED plotted against APPROPRIATE, divided into a T region, a ? region and a NIL region.)

5. Limitations 

The modelling being conducted by the Psychologist rests on the assumption that the
skills employed by the student are a subset of those of the Expert. This is not
inevitable for at least three reasons. First, the student may be solving problems in a
fashion completely divergent from the Expert - there can be multiple paradigms for the
particular problem domain. Second, the student may be using a non-optimal method for
his own reasons. A Wumpus player may be more concerned with finishing quickly than
avoiding risk, and hence choose a move to a more informative cave, despite greater
risk. Third, the student may possess a skill of the Expert in an incorrect form,
perhaps using it inappropriately.

We have sought to make the Expert a useful foundation for modelling by
imposing certain criteria on its design. The major one is the use of a rule
system to represent the heuristics commonly employed by skilled players. This approach
has been profitably employed in the medical domain [Shortliffe 74] and we similarly
find it a useful framework in which to modularly represent human skill. By
interviewing skilled players and by introspection, we have evolved a rule system whose
reasoning is acceptable to skilled players as capturing the essential ingredients of
their own analyses. In this fashion, we have constructed an Expert for the game of
Wumpus that provides a reasonable basis for modelling.

For the restricted decision making environment of Wumpus, we have not encountered
multiple problem solving paradigms nor have we found it common for a student to ignore
the basic strategy of choosing the safest unvisited cave. However, for other domains
such as mathematical problem solving, the possibility of multiple models of expertise
exists. It is a fundamental limitation of overlay modelling that a player cannot be
modelled who employs a logic not understood by the Expert. Indeed, a human teacher
cannot understand a student reasoning in a legitimate fashion unknown to the teacher.
The power of a successful teacher arises from knowing multiple means for solving a
given problem, and hence being sensitive to the particular choice made by the student.
The same possibility is available to the Coach, if a Meta-Expert is provided. A
Meta-Expert is a set of Experts for the given task, each modular, articulate, and
comprehensible; and each capable of supplying a move analysis from its own
perspective.

With a Meta-Expert, the Psychologist can attempt to identify which Expert the
student most closely approximates. The evidence distinguishing the experts derives
from those situations where the predictions of the Experts differ. However, the cost
of multiple experts is one more source of uncertainty. We have avoided this difficulty
to date by choosing a tutoring situation - Wumpus - where there is broad agreement
upon the part of Expert players as to the necessary skills. The design of Coaches with
a Meta-Expert module is a future research goal.
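
Since the Meta-Expert is a future research goal, any code can only be speculative. The Python sketch below shows one simple way the idea might be realized: count agreement between the student's moves and each Expert's recommendation, using only situations in which the Experts disagree. The expert names and the scoring rule are invented for illustration.

```python
def closest_expert(observations):
    """Pick the Expert whose recommendations best match the student's moves.

    observations: list of (student_move, {expert_name: recommended_move}).
    Situations where all Experts agree are skipped, since they cannot
    distinguish between Experts.
    """
    scores = {}
    for student_move, predictions in observations:
        if len(set(predictions.values())) < 2:
            continue
        for name, move in predictions.items():
            scores[name] = scores.get(name, 0) + (move == student_move)
    return max(scores, key=scores.get) if scores else None


observations = [
    (0, {"risk_averse": 0, "fast_finisher": 14}),
    (2, {"risk_averse": 2, "fast_finisher": 2}),   # no disagreement: ignored
    (7, {"risk_averse": 7, "fast_finisher": 9}),
]
print(closest_expert(observations))
```

A real Psychologist would also need to weigh this choice against the uncertainty already present in the P rules, which is the cost the text warns about.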

Meta-experts, however, do not address the modelling difficulties arising when the
student employs a skill in an incorrect form. For example, we have found some students
to employ the positive and negative evidence skills for bats and pits but not for the
Wumpus. The reason presumably is the greater simplicity resulting from the fact that
bat and pit warnings propagate only one cave, while the Wumpus warning propagates two
caves. We have addressed this problem by not organizing the Expert around the most
general set of skills. Rather, positive and negative evidence has been represented by
"micro" skills, one set for the one-cave warnings of bats and pits and the other set for
the two-cave warnings of the Wumpus. Our philosophy has been to break the skill
analysis into sufficiently simple rules that a model which only records their presence
or absence is sufficient.

It would be better to have a general theory of learning that suggested typical
bugs that might occur in learning a given skill. In other research, the Coach project
has studied the theory of bugs in relation to different kinds of plans [Miller &
Goldstein 1976]. But in Wumpus the overall plan is simple - find the relative dangers
of the caves. Hence, we find that we are able to model the student without an elaborate
bug analysis. Future research will seek to couple a theory of debugging to the theory
of overlay modelling.

Given this analysis of the fundamental assumptions of overlay modelling and its
limitations, there are clearly four situations where such modelling will fail. These
are situations in which the underlying assumptions of the modelling rules are
violated.

1. Extreme inconsistency on the part of the player: the P critic will
ultimately set the KNOWN variable of all skills to "?".

2. Unrecognized expertise employed by the player: again the P critic will
ultimately turn off the Psychologist, unless a Meta-Expert is available.

3. Player explanations in complex verbal form: natural language comprehension
in the Coach is not yet implemented. Explanations expressed in English by
the player are not allowed.

4. Distinguishing first order from second order bugs, that is, distinguishing
the complete absence of a skill from its inappropriate use. Test questions
help in this situation, but are not always sufficient.

However, these situations would also task the abilities of a human teacher.

Despite these limitations, overlay modelling remains useful for two reasons. The
first is that overlay modelling in its relation to explanation is essentially a
theory of the speaker. Each of us, when formulating an explanation,
abbreviates the explanation in accord with our model of the listener. This model is
based on our analysis of the listener's behavior in terms of the knowledge we believe
is relevant. Overlay modelling performs a similar function for the Coach. A human
speaker or computer coach may have a mistaken model of the listener, but ultimately a
person or computer can judge another only in terms of what he, she or it knows itself.

The second reason arises from the special demands of the educational context. The
Coach is not an impartial observer, but rather has the goal of conveying its style of
expertise. Hence, its insight into the student can be useful, even if limited to
hypotheses regarding which aspects of its expertise the student possesses. Its goal is
to convey that style it knows about; its modelling is to determine how much of that
style is known.

6. Experimental Program

The fundamental question is how accurate the K and L models are as estimates of
the player's knowledge and learning preferences. To address this question, we are
employing four different classes of experiment.

1. Turing Tests: Human players will be analyzed by interviewers to provide benchmarks
for the level of modelling that can be achieved by competent human teachers. In
one variation, an accomplice will be asked to deliberately simulate certain student
strategies, and the ability of human observers to detect these strategies will be
studied. These Turing Tests will determine if the Psychologist module provides
modelling performance comparable to human observers.

2. Articulate Psychologist Experiments: Our rule-based approach to modelling allows
the Psychologist to explain its hypotheses by reporting which rules were triggered
and by what evidence. The accuracy of these self-explanations will be judged by
both the student himself, and an interviewer who observes the student's play and
discusses his moves with him.

3. Closed Loop Experiments: The game will be played by a modified version of the
Expert program which employs a sub-optimal strategy. The Psychologist will be
judged by whether it diagnoses the strategy.

4. Predictive Experiments: An overlay model can yield a deterministic procedural model
of a player by deleting all rules of the Expert with KNOWN = NIL. The result is a
simulated player that can be used to predict the player's performance. The
accuracy of these predictions will provide another test of the Psychologist's
success.
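
The construction behind the predictive experiments is simple enough to sketch. In the Python below, the Expert is reduced to a list of rule names and the overlay model to a mapping from rule name to its KNOWN value; both representations are our own simplification, and rules marked "?" are retained because the text removes only those marked NIL.

```python
def simulated_player_rules(expert_rules, known):
    """Drop every Expert rule whose KNOWN value is NIL in the overlay model."""
    return [rule for rule in expert_rules if known.get(rule) != "NIL"]


expert_rules = ["L1", "L2", "L3", "P1", "P2"]
overlay_model_1 = {"L1": "T", "L2": "T", "L3": "?", "P1": "T", "P2": "NIL"}

print(simulated_player_rules(expert_rules, overlay_model_1))
```

Running the reduced rule set on the same board states as the real player would then give predicted moves to compare against the moves actually made.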

To date, we have carried out informal "articulate psychologist" experiments with
WUSOR-II. Players over a wide spectrum of skill find the comments generated by the
Psychologist module to be comprehensible and reasonable, as evaluated by interviews
with the players. We have also run closed loop experiments in which an impartial
player consistently employs a sub-optimal strategy. WUSOR-II successfully diagnoses
this. We are currently in the process of designing simulated players to serve as
rigorous closed loop tests.

We plan over the next 12 month period to run the two most ambitious classes of
experiments, Turing Tests and Predictive Experiments. Our subject population will be
undergraduates enrolled in an education major. (We will be interested both in the
success WUSOR has in coaching these students and in their reactions to WUSOR as an
educational tool.)

In summary, we are encouraged by reactions of students and teachers to the current 
state of the Coach, but rigorous evaluation of overlay modelling remains to be done.

7. Related Literature 
WEST: The WEST program by Burton and Brown [1976] is a computer coach for the
PLATO game "HOW THE WEST WAS WON". In this game, a player must form from three numbers
an arithmetic expression whose value is either the largest possible or, occasionally, a
given number. The educational purpose of the game is to provide experience with
arithmetic operators and the use of parentheses. C. Resnick [1975] found that many
students reached plateaus, such that they failed to improve their skill, although they
continued to enjoy the game. The WEST coach was designed to discuss less than optimal
moves with the student in order to move him or her off such a plateau.

Burton and Brown model the student by contrasting his or her move to the move 
recommended by the expert. USED and APPROPRIATE variables are maintained to record the 
frequency with which different skills are employed. Burton and Brown's development of 
this modelling technique was our starting point. We have extended their approach in 
three ways. 

(1) A syllabus is introduced that organizes the skills in a complexity/dependency 
graph. For complex situations this is required. For simpler domains with a limited 
number of skills, it is less important. For WEST, the syllabus of Fig. 8 might have 
been employed, which reflects the usual order in which arithmetic skills are taught. 



PARENTHESIZATION 

ADDITION ► SUBTRACTION ► ORDERING 

MULTIPLICATION ► DIVISION 

FIG. 8 - A POSSIBLE WEST SYLLABUS 



(2) A P critic is introduced to observe discontinuities and inconsistencies. This 
is important for detecting when the modelling is failing to capture the student's 
behavior. It could readily be applied to the WEST case. 

(3) Multiple sources of evidence are used to increase the window into the 
student's reasoning. WEST relied solely on implicit evidence derived from the 
student's play. A facility for obtaining explicit evidence through follow-up questions 
could be incorporated into the WEST coach. 

BIP: BIP [Wescourt 1976] is a CAI program for tutoring elementary programming 
skills. We mention it here to cite an alternative to Expert-based overlay modelling. 
BIP uses a very detailed syllabus as does overlay modelling. But BIP associates with 
each skill in the syllabus (called the Curriculum Information Network) a set of 
specific exercises and a description of the various correct and incorrect solutions. A 
skill is attributed to the student if he or she succeeds at these exercises. 

The virtue of this approach is that the diagnosis of whether a skill is employed 
is much simpler to make. An elaborate domain expert is not needed. The disadvantage, 
however, is that the tasks are very restrictive, e.g. a typical one might be to "PRINT 
A LITERAL". The greater complexity of overlay modelling with respect to an embedded 
Expert program is required to allow free choice by the student in more complex problem 
settings. 

Human Problem Solving: Overlay modelling is a potentially valuable tool for 
information processing psychology. Hence we compare it here to the work done by Newell 
and Simon [1972] and their colleagues. Overlay modelling can be used to induce a 
production system model of a human problem solver. The required ingredient is an Expert 
that analyzes the problem solver's acts in terms of a set of constituent skills - in 
this case a set of productions. This notion of comparing a problem solving protocol to 
the behavior of a production system is briefly described as the "trace" feature of the 
PAS-II protocol analysis program [Waterman and Newell 1973]. 
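
The comparison of a protocol against a production system can be sketched in the same spirit. The Python fragment below is an illustration under stated assumptions, not a description of PAS-II or of the Coach: the Production record, the trace function, and the toy productions and protocol are hypothetical, and the APPROPRIATE and USED counters are borrowed from the WEST description above.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

State = Dict[str, int]

@dataclass(frozen=True)
class Production:
    """A condition-action pair (hypothetical layout)."""
    name: str
    condition: Callable[[State], bool]
    action: str

def trace(productions: List[Production],
          protocol: List[Tuple[State, str]]) -> Tuple[Counter, Counter]:
    """Count, per production, how often it was APPROPRIATE and how often USED.

    A production is APPROPRIATE in a step if its condition holds in that state,
    and USED if, in addition, its action matches the act the subject performed.
    """
    appropriate: Counter = Counter()
    used: Counter = Counter()
    for state, act in protocol:
        for p in productions:
            if p.condition(state):
                appropriate[p.name] += 1
                if p.action == act:
                    used[p.name] += 1
    return appropriate, used

# Toy data, invented for illustration only.
productions = [
    Production("take-big-step", lambda s: s["distance"] > 5, "multiply"),
    Production("take-small-step", lambda s: s["distance"] <= 5, "add"),
]
protocol = [({"distance": 9}, "multiply"),
            ({"distance": 8}, "add"),
            ({"distance": 3}, "add")]

appropriate, used = trace(productions, protocol)
```

A production that is often APPROPRIATE but rarely USED is a candidate for exclusion from the induced model of the subject.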

In the computer coach context, we have not attempted the level of detail in 
modelling that Newell and Simon seek, wherein even eye movements must be accounted for. 
The Coach does not have that much information regarding the student's behavior. 
Indeed, we do not allow unrestricted English interaction. In the PAS-II protocol 
analysis program, English is permitted by making the program interactive, i.e. a 
human analyst can aid in interpreting the protocol. Such a solution is not applicable 
to the real time demands made by the computer coaching context. 

Our development of overlay modelling suggests an extension to production-based 
modelling, in the form of the Syllabus. The productions for a given problem domain can 
be organized into a network reflecting complexity and dependency. This network then 
suggests the order in which the productions are acquired. 
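
How a dependency network yields an acquisition order can be illustrated with a topological sort. The sketch below uses Python's standard graphlib module and the skill names of Fig. 8. The chains from ADDITION through SUBTRACTION to ORDERING and from MULTIPLICATION to DIVISION follow the figure; the remaining links, and the ready_skills helper, are assumptions made for the example.

```python
from graphlib import TopologicalSorter
from typing import Dict, Set

# Each skill maps to the skills it presupposes. Links other than the two
# chains shown in Fig. 8 are assumptions made for this illustration.
syllabus: Dict[str, Set[str]] = {
    "ADDITION": set(),
    "SUBTRACTION": {"ADDITION"},
    "ORDERING": {"SUBTRACTION"},
    "MULTIPLICATION": {"ADDITION"},          # assumed link
    "DIVISION": {"MULTIPLICATION"},
    "PARENTHESIZATION": {"ORDERING", "DIVISION"},  # assumed links
}

# One acquisition order consistent with the network: prerequisites come first.
order = list(TopologicalSorter(syllabus).static_order())

def ready_skills(known: Set[str]) -> Set[str]:
    """Skills not yet known whose prerequisites are all known (hypothetical helper)."""
    return {skill for skill, prereqs in syllabus.items()
            if skill not in known and prereqs <= known}
```

The order is generally not unique; any ordering that respects the dependencies is admissible. A helper such as ready_skills gives the frontier of skills a tutor might reasonably address next.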

8. Conclusions 

Overlay modelling constitutes a set of techniques for describing a person's 
problem solving skills in terms of an expert program for the task. These techniques 
are rule systems for monitoring multiple sources of evidence, overlays for structuring 
the model, and a critic for detecting non-linearities. This approach has limitations, 
but it has already shown itself to be useful for maintaining a model of the learner's 
state as part of an AI-based CAI program. 

Ultimately, progress towards an improved theory of modelling will have an 
important impact on the following areas: 

1. In CAI by addressing the critical need to model the learner so as to 
provide high quality personalized instruction. 

2. In education by offering overlays as a structural, non-numerical model of 
the student. 

3. In applied AI by improving the ability of an AI program employed as an 
intelligent assistant to generate appropriate explanations for the user. 

4. In theoretical AI by defining criteria such as comprehensibility and 
modularity that expert programs should satisfy if they are to be useful as 
part of AI-based CAI systems. 

5. In information processing psychology by developing a procedural theory for 
inducing models of a subject's problem solving behavior. 